Optimization of ultrasound-aided extraction of bioactive ingredients from Vitis vinifera seeds using RSM and ANFIS modeling with machine learning algorithm

Plant materials are a rich source of polyphenolic compounds with interesting health-beneficial effects. The present study aimed to determine the optimized conditions for maximum extraction of polyphenols from grape seeds using response surface methodology (RSM), an adaptive neuro-fuzzy inference system (ANFIS), and a machine learning (ML) algorithm. Five independent variables, particle size (X1: 0.5–1 mm), methanol concentration (X2: 60–70% in distilled water), ultrasound exposure time (X3: 18–28 min), temperature (X4: 35–45 °C), and ultrasound intensity (X5: 65–75 W cm−2), were studied at five levels (−2, −1, 0, +1, and +2) for their effect on the dependent variables: total phenolic content (y1; TPC), total flavonoid content (y2; TFC), 2,2-diphenyl-1-picrylhydrazyl free radical scavenging (y3; %DPPH*sc), 2,2′-azino-bis(3-ethylbenzothiazoline-6-sulfonic acid) free radical scavenging (y4; %ABTS*sc), and ferric ion reducing antioxidant potential (y5; FRAP). The optimized condition was observed at X1 = 0.155 mm, X2 = 65% methanol in water, X3 = 23 min ultrasound exposure time, X4 = 40 °C, and X5 = 70 W cm−2 ultrasound intensity. Under these conditions, the optimal yields of TPC, TFC, and antioxidant scavenging potential were 670.32 mg GAE/g, 451.45 mg RE/g, 81.23% DPPH*sc, 77.39% ABTS*sc, and 71.55 μg mol (Fe(II))/g FRAP. The experimental values obtained under this optimal condition agreed with the predicted values. A well-fitted quadratic model was recommended. Furthermore, the validated extraction parameters were optimized and compared using ANFIS and a random forest regressor ML algorithm. Gas chromatography–mass spectrometry (GC–MS) and liquid chromatography–mass spectrometry (LC–MS) analyses were performed to identify the bioactive compounds in the optimized extract.

Grape pomace is used to recover a wide range of products, including ethanol, tartrates, citric acid, grape seed oil, hydrocolloids, bioactive compounds, and dietary fibre. Grape pomace is one of the significant research areas in the field of fibre extraction, particularly pectin 3. Importantly, grape seeds are a cost-effective source of antioxidant and potentially therapeutic compounds in the form of polyphenols 4. Grape seeds are reported to consist of 11% protein, 35% fibre, 3% minerals, 7% water, 7-20% lipids, and 7% polyphenolic compounds (especially tocopherols and β-carotene) 5. Polyphenols and other phenolic compounds are gaining attention from scientists because of their potential benefits for human health 6. Polyphenolic compounds can neutralize over-generated free radicals (reactive oxygen species (ROS), reactive nitrogen species (RNS), and DNA reactive aldehyde (DRA)) 7. Free radicals are typically generated as a by-product of oxygen metabolism and are released by mitochondria. Free radicals play a dual role: at a low level, they are vital for many cellular signaling mechanisms (i.e., they regulate cellular events such as the cell cycle, proliferation, migration, and programmed cell death) 8. In contrast, at a high level, they lead to several pathological complications, including damage to proteins and nucleic acids, cell and lipid membrane disturbances, and reduced cellular viability 9.
Remarkably, an elevated level of ROS causes oxidative stress and the loss of antioxidant and detoxifying enzymes in cells and tissues 9. Oxidative stress arises from an imbalance between the generation and accumulation of ROS in cells and tissues and the capacity of the biological system to detoxify these reactive species 10. Numerous studies have demonstrated that oxidative stress and depletion of antioxidant enzymes might have a role in the development and progression of several diseases, such as cancer, diabetes mellitus, cardiovascular diseases (coronary heart disease, atherosclerosis), metabolic disorders, arthritis, and neurodegenerative disorders [11][12][13][14]. Recently, bioactive ingredients from grape seeds have gained more attention due to their therapeutic importance 15. In addition, grape seed powder is a nutraceutical agent usually consumed as a well-being/dietary supplement and sold as an over-the-counter product in the United States of America 16. Grape seeds possess numerous polyphenolic compounds, including flavan-3-ols, which act to prevent various diseases 17. The flavan-3-ols (catechin, epicatechin, epigallocatechin, proanthocyanidin, trans-resveratrol, procyanidin B1, and their polymers) are natural antioxidants (they eliminate ROS, RNS, and DRA and stimulate detoxifying and antioxidant enzymes) that prevent cell damage and provide other benefits 18,19. Unfortunately, these bioactive ingredients are present inside the cell in meagre quantities and are thermolabile. Therefore, a sophisticated extraction technique is needed to extract these bioactive compounds from plant sources completely. Compared to modern extraction techniques, traditional methods (such as soxhlation and blending) consume more solvent, take longer, and give a lower yield of active compounds 20.
A few advanced extraction techniques, such as microwave-assisted extraction (MAE), pressurized liquid extraction (PLE), ultrasound-aided extraction (UAE), and carbon-dioxide super-critical extraction (CSCE), are used in the pharmaceutical industry and research laboratories. These advanced extraction techniques are generally regarded as greener, environment-friendly technologies because they consume less energy, permit the use of alternative solvents and renewable natural products, and ensure a safe and high-quality extract/product 21,22. Carbon-dioxide super-critical extraction is one of the best techniques because it uses little or no solvent, can be computer-controlled, has a shorter extraction time, and extracts compounds, especially thermolabile ones, from plant sources without damage 23. Unfortunately, industries and research laboratories are seeking alternative extraction methods because of the unaffordable cost of the carbon-dioxide super-critical extractor 24. In MAE, microwaves heat the solvent system to enhance the solubility of the bioactive compounds of plant cells 25. The heat generated can degrade thermosensitive compounds, which is the major drawback of using MAE for such compounds 26. Conversely, UAE is an attractive and cost-effective alternative for completely extracting plant-derived bioactive ingredients 14. UAE is one of the most preferred extraction methods: it uses less solvent, can be automated, operates at lower temperatures, requires less energy, gives a higher yield, and takes less time to extract the bioactive ingredients. Ultrasonic vibrations accelerate the release of extractable components into the solvent by enhancing mass transport. They also rupture the plant cells by creating physical pressure during ultrasound cavitation 7.
The present study aimed to maximize the extraction of bioactive ingredients from grape seeds using an ultrasound-aided extraction technique. Many extraction parameters, such as particle size, extraction solvent, solvent concentration, solid-to-liquid ratio, temperature, ultrasonic exposure time, ultrasound intensity, pulse cycle/mode, and pH, potentially influence the yield of bioactive ingredients and their free radical scavenging properties 27. Combining these parameters has given the highest yield of bioactive compounds from plant sources, even though they appear fundamentally unrelated 28. The combination of these parameters must therefore be optimized to achieve the maximum yield of bioactive ingredients 29. For this purpose, a statistical method of optimization is helpful. One effective method frequently considered is the response surface methodology (RSM) 30. RSM is a statistical technique that determines and simultaneously solves multivariate equations using quantitative data from relevant studies 31. In RSM, a second-order polynomial equation is applied for modeling and optimization 32. RSM can be used to compare the theoretical and actual values of the variables involved in the process with the help of the second-order polynomial equations generated in the experiment 33. Several designs, namely the central composite design (CCD), the Box-Behnken design (BBD), and the three-level full factorial design (FBD), have been widely applied in RSM to obtain an optimized extraction of bioactive polyphenolic compounds from natural sources 34. One of the designs used for the application of RSM is the CCD, which provides viable models for processes 35. The Box-Behnken design is made up of rotated lower-dimensional designs and estimates all linear effects, quadratic effects, and two-way interactions, whereas the CCD is made up of a cube part, which is a full factorial that determines main and interaction effects, and a star design (α) that quantifies main and quadratic effects 36. The Box-Behnken design does not allow for reductions in design and is much less flexible than the CCD; its design space is devoid of any corner points. In the CCD, the factorial portion of the design defines the design space box, the axial points lie outside of it, and the design is rotatable. This makes it possible to estimate the expected response with equal variance in any direction with respect to the centre of the design space. Therefore, many researchers have used the CCD to extract bioactive compounds and antioxidants 37. The adaptive neuro-fuzzy inference system (ANFIS) and machine learning algorithm approaches are also used to predict the optimal conditions, producing the best results for nonlinear systems 38. ANFIS simulates human thought processes using highly developed fuzzy and neural network computer systems 39. ANFIS is an intelligent neuro-fuzzy method used to study how variables interact and have nonlinear effects 40. ANFIS is a hybrid intelligent system 41. Consequently, ambiguous relationships among multiple variables can be quantified using ANFIS through the defuzzification process of the fuzzy inference system (FIS), and the error is adjusted for dependable prediction using a backpropagation algorithm with a hidden layer of an artificial neural network (ANN) 42. Further, a machine learning algorithm is used to adapt the most effective extraction parameters. The aim of machine learning algorithm-based optimization is to reduce the degree of error in a machine learning model, improving its accuracy in making predictions from data. Machine learning is generally used to learn the underlying relationship between inputs and output responses from a set of training data 43. When confronted with new data in a live environment, the model can use this learned approximate function to predict an outcome. Optimization algorithms can make this process more efficient than any manual process. These algorithms optimize a machine learning model iteratively using mathematical models 44. The Random Forest Regressor is a simple and widely used algorithm in machine learning 45. Hyperparameter configurations are searched at random to find the best combination. Ultimately, the bioactive ingredients were identified using GC-MS (gas chromatography-mass spectrometry) and LC-MS (liquid chromatography-mass spectrometry), indicating the potential of the chosen grape seeds for use in the healthcare sector.

Grape seeds
Fresh fruits of grapes "Cumbum Panneer Thratchai" (Vitis vinifera L., family: Vitaceae; Muscat Hamburg variety) were personally collected as a gift sample from a grape farm in the Cumbum valley (latitude: 9.734426°, longitude: 77.280739°), Cumbum (known as the 'Grapes city of South India'), Theni District, Tamilnadu, India, in April 2023. The plant material was collected with the consent of the Swamy Vivekanandha College of Pharmacy, Tiruchengode, Tamilnadu, India. No further regulation was required for the collection of this plant. In addition, the collection of plant material complied with the relevant institutional (Swamy Vivekanandha College of Pharmacy), national, and international guidelines and legislation. A pharmacognosist, Professor Murugananthan Gopal M. Pharm., PhD., Principal, Department of Pharmacognosy, Swamy Vivekanandha College of Pharmacy, Tiruchengode, Tamilnadu, India, authenticated the grape seeds collected from Cumbum (Specimen Number: VV/HER/COG-002). The specimen was deposited at the herbarium, Swamy Vivekanandha College of Pharmacy, Tiruchengode, Tamilnadu, India. The grape seeds were removed from the fruits, washed with tap water, and shade-dried for seven days. Thoroughly dried grape seeds were ground into a fine powder using a kitchen mixer grinder (Butterfly Gandhimathi Home Appliances Ltd., Chennai, India). Then, the powdered sample was screened into specified particle sizes (0.15, 0.5, 0.75, 1.0, and 1.35 mm) using the corresponding sieves (Mesh No. 100, 35, 20, 18, and 14). The powder samples were stored in an airtight container until the start of the experiment (moisture content: 10 ± 2%).

Ultrasound-aided solvent extractor set-up
The extraction of bioactive ingredients from grape seed powder was performed with an ultrasonic bath extractor (designed by PCI Analytics Ltd., Mumbai, India) having an optimal capacity of 250 mL. The ultrasonic bath has a temperature controller device (accuracy of ± 1.0 °C) with degassing, ultrasound power (220 V), 33 ± 3 kHz operating frequency, continuous mode at 40 kHz high-intensity ultrasound processor, and a single-phase input voltage range of 200–230 VAC. The ultrasound instrument produced a maximum ultrasound intensity of 220 W cm−2 with a 25 mm titanium probe.

Preliminary experiments for selection of suitable solvent system
The preliminary experiments were performed to identify the best solvent system for maximum yield of bioactive ingredients, based on the highest total phenolic content (TPC), total flavonoid content (TFC), and %DPPH*sc of the grape seed extract. Eight solvents, namely ethanol, methanol, chloroform, petroleum ether, ethyl acetate, diethyl ether, acetone, and n-hexane, were selected for this investigation. For each solvent system, the extraction used 2 g of grape seed powder (particle size 0.5 mm), 10 mL of solvent at a fixed concentration (70% V/V in distilled water), an ultrasound intensity of 60 W cm−2, an ultrasound exposure time of 10 min, and a temperature of 40 °C. A UV-Visible spectrophotometer (Shimadzu UV-1800 series with UV Probe 2.62 software, Japan) was used to measure the concentration of bioactive ingredients in the grape seed extract. A rotary vacuum dryer (Buchi rotary evaporator, Mumbai, India) was used to concentrate the extracts. The concentrated extracts were then lyophilized (freeze dryer) to convert them into powder form and stored in a desiccator until the experiment.

Procedure for ultrasound-aided extraction (UAE) of bioactive ingredients
The extraction of bioactive ingredients from grape seed powder was performed using an adjustable ultrasonic bath extractor with a sample of 2 g grape seed powder in a closed container containing 10 mL of solvent (methanol in water), at the specified particle size, ultrasound exposure time, temperature, and ultrasound intensity. Experiments were conducted in triplicate according to the central composite design (CCD) of the response surface methodology (RSM). After UAE, the extracts were filtered using Whatman No. 1 filter paper, and the filtrate was centrifuged at 3500 rpm for 30 min at 4 °C. The supernatant liquid (extract) was concentrated at 40 °C using a rotary vacuum dryer. The concentrated methanolic extract was then lyophilized (freeze dryer) to convert it into powder form for the determination of TPC, TFC, and antioxidant potentials (%DPPH*sc, %ABTS*sc, and FRAP).

Determination of total phenolic content (TPC)
The quantity of TPC (y1) present in the extract was determined spectrophotometrically according to the previously described method 46. Briefly, 0.2 mL of grape seed extract was mixed individually with 5 mL of 10% resolubilized Folin-Ciocalteu reagent. The mixture was vortexed for 2 min, and 2 mL of 7.5% sodium carbonate (Na2CO3) was added after 5 min. The samples were kept in the dark at room temperature for an hour. A UV-Visible spectrophotometer (Shimadzu UV-1800 series with UV Probe 2.62 software, Japan) was used to measure the absorbance at 765 nm. The results were presented as mg of gallic acid equivalent (GAE) per gram of sample, using gallic acid as the reference standard.

Determination of total flavonoid content (TFC)
The TFC (y2) was determined by the method developed by da Silva et al. 47. First, 1 mL of resolubilized extract sample was mixed with 0.3 mL of 5% sodium nitrite. The mixture was allowed to incubate for 6 min in a dark place, and then 0.3 mL of 10% aluminium chloride solution was added. Next, 3 mL of 1 M sodium hydroxide was added to the reaction, and the incubation was continued for 10 min. After 10 min, a UV-Visible spectrophotometer was employed to measure the absorbance at 510 nm. The results were presented as mg of rutin equivalent (RE) per gram of sample.

%DPPH scavenging assay
The DPPH free radical scavenging potential (y3) of grape seed extracts was determined according to Musa et al. 48. Concisely, 3 mL of DPPH free radical solution (0.1 mM DPPH in ethanol) was mixed with 0.1 mL of grape seed extract; the mixture was then incubated at 37 °C for 30 min. After incubation, a UV-Visible spectrophotometer was used to quantify the absorbance at 517 nm. Methanol and DPPH were both employed as controls. The % DPPH radical scavenging ability was calculated using Eq. (1), where A0 is the absorbance of the control and A1 is the absorbance of the sample.
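As a minimal sketch, the scavenging percentage of Eq. (1), ((A0 − A1) × 100)/A0, can be computed as follows; the same expression applies to the ABTS assay (Eq. 2). The function name and the absorbance readings are hypothetical and serve only to illustrate the arithmetic.

```python
def percent_scavenging(a_control: float, a_sample: float) -> float:
    """Percentage radical scavenging: ((A0 - A1) * 100) / A0."""
    if a_control <= 0:
        raise ValueError("control absorbance must be positive")
    return (a_control - a_sample) * 100.0 / a_control


# Hypothetical absorbance readings at 517 nm (illustrative, not measured data)
a0, a1 = 0.80, 0.20
print(f"{percent_scavenging(a0, a1):.1f} % scavenging")
```

A sample that absorbs as strongly as the control gives 0%, and a sample with no residual absorbance gives 100%.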

%ABTS scavenging assay
The ABTS free radical scavenging potential (y4) of grape seed extract was determined according to Canabady-Rochelle et al. 49. Concisely, 2.45 mM potassium persulfate was mixed with 7 mM ABTS radical solution, and the resultant reaction mixture was stored in the dark for 16 h at room temperature. Then, ethanol was used to adjust the absorbance of the reaction mixture to 0.70 ± 0.05 at 734 nm. Next, 1 mL of this reaction mixture was mixed with 10 µL of grape seed extract. The absorbance was measured against the blank reagent at 734 nm. The inhibition activity was determined using Eq. (2), where A0 is the absorbance of the control and A1 is the absorbance of the sample.

The ferric-reducing antioxidant potential (FRAP) assay
The FRAP (y5) of grape seed extract was determined based on the FRAP technique 50 as modified by Pulido et al. 51. The FRAP reagent was prepared using 300 mM acetate buffer (3.1 g sodium acetate in 16 mL acetic acid at pH 3.6), 10 mmol TPTZ, and 20 mmol FeCl3·6H2O and 4 mmol hydrochloric acid in the ratio of 10:1:1. Then, 0.1 mL of grape seed extract was mixed with 3.0 mL of FRAP reagent and incubated in darkness for 30 min at 37 °C. The absorbance was read at 595 nm using a UV-Vis spectrophotometer. The standard curve was linear between 200 and 1000 μM FeSO4. Results calculated in μM Fe(II)/g dry mass were compared with ascorbic acid as a standard.

Experimental design and optimization using RSM
A five-level, five-coded-variable central composite design (CCD) in response surface methodology (RSM) was applied to optimize the effective extraction parameters of the ultrasound-aided extraction technique with respect to TPC, TFC, and antioxidant potentials (DPPH*sc, ABTS*sc, and FRAP) of grape seed extract. The five selected coded variables were particle size (X1: 0.5-1 mm), methanol concentration (X2: 60-70% in distilled water), ultrasound exposure time (X3: 18-28 min), temperature (X4: 35-45 °C), and ultrasound intensity (X5: 65-75 W cm−2), investigated at five levels, very low (−2), low (−1), medium (0), high (+1), and very high (+2), for maximum yield of bioactive ingredients from grape seed extract. The independent variables were coded according to Eq. (3):

\[ x_i = \frac{X_i - X_z}{\Delta X_i} \tag{3} \]

where x_i is the dimensionless value of the independent parameter; X_i is the actual value of the independent parameter; X_z is the actual value of the independent parameter at the central point; and ΔX_i is the step change of the actual value of parameter i corresponding to a variation of one unit in its dimensionless value. The total number of experiments was calculated from Eq. (4):

\[ N = 2^k + 2k + n_0 \tag{4} \]

where N is the total number of experiments, k is the number of independent variables, and n_0 is the number of replicates at the central point, resulting in an experimental design of 50 runs. The correlation between the dependent and independent variables was determined by fitting the experimental data to a second-order polynomial regression model.
These 50 experimental runs comprised 32 factorial points, 8 replicates of the central point, and 10 axial points (α) at a distance of ± 2 from the centre point, as shown in Table 2. The results of the CCD studies (Table 2) were analysed using the multiple regression equation, Eq. (5):

\[ Y = \alpha_0 + \sum_{i=1}^{k} \alpha_i X_i + \sum_{i=1}^{k} \alpha_{ii} X_i^2 + \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} \alpha_{ij} X_i X_j + \varepsilon \tag{5} \]

For the five variables considered here, Eq. (5) can be written as Eq. (6):

\[ Y = \alpha_0 + \sum_{i=1}^{5} \alpha_i X_i + \sum_{i=1}^{5} \alpha_{ii} X_i^2 + \sum_{i=1}^{4} \sum_{j=i+1}^{5} \alpha_{ij} X_i X_j + \varepsilon \tag{6} \]

where Y is the dependent response, α_0 is the intercept constant, α_i are the linear coefficients, α_ii are the quadratic coefficients, and α_ij are the interaction coefficients. X_i and X_j are the coded values of the independent variables particle size (X1), methanol concentration (X2), ultrasound exposure time (X3), temperature (X4), and ultrasound intensity (X5), and ε is the error. Model significance (p value), coefficient of determination (R2), predicted coefficient of determination (R2 pred), adjusted coefficient of determination (R2 adj), and the adequacy of the models according to the lack-of-fit statistic were all determined by analysis of variance (ANOVA). Only significant coefficients (p < 0.05) or those necessary for the hierarchy were considered when creating the models. Further analysis was performed to determine the accuracy of the extraction parameters.
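The design described above can be enumerated in a few lines. The sketch below builds the coded design (2^5 factorial points, 2 × 5 axial points at ± 2, and 8 centre replicates) and converts coded levels back to actual values by inverting the coding relation, X_i = X_z + x_i ΔX_i. The centre values and step sizes for X2 to X5 follow from the stated ± 1 ranges; the entry for X1 (centre 0.75 mm, step 0.25 mm) is an assumption taken from the midpoint of the 0.5-1 mm range, and all names are illustrative.

```python
from itertools import product

K = 5          # number of independent variables
N_CENTRE = 8   # replicates at the central point
ALPHA = 2.0    # axial distance in coded units

# (centre value X_z, step dX) per factor; the X1 entry is an assumption.
FACTORS = {
    "X1 particle size (mm)": (0.75, 0.25),
    "X2 methanol (%)": (65.0, 5.0),
    "X3 exposure time (min)": (23.0, 5.0),
    "X4 temperature (deg C)": (40.0, 5.0),
    "X5 intensity (W cm^-2)": (70.0, 5.0),
}


def ccd_coded(k=K, n_centre=N_CENTRE, alpha=ALPHA):
    """Return the coded CCD runs: factorial, then axial, then centre points."""
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for level in (-alpha, alpha):
            point = [0.0] * k
            point[i] = level
            axial.append(point)
    centre = [[0.0] * k for _ in range(n_centre)]
    return factorial + axial + centre


def decode(coded_row):
    """Convert coded levels to actual values: X_i = X_z + x_i * dX_i."""
    return [xz + x * dx for x, (xz, dx) in zip(coded_row, FACTORS.values())]


design = ccd_coded()
print(len(design))        # total number of runs, 2**5 + 2*5 + 8
print(decode(design[0]))  # actual settings of the first factorial run
```

The run order here is systematic; in practice the runs of a designed experiment are usually randomized before execution.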
\[ \%\ \text{DPPH radical scavenging activity} = \frac{(A_0 - A_1) \times 100}{A_0} \tag{1} \]

\[ \%\ \text{ABTS radical inhibition activity} = \frac{(A_0 - A_1) \times 100}{A_0} \tag{2} \]

The adaptive neuro-fuzzy inference system (ANFIS) has an advantage over artificial neural networks (ANN) because it combines the best features of neural networks and fuzzy logic to model complex systems more accurately and precisely 52. Neural networks are inspired by the properties of biological neural networks that resemble the human brain; these networks learn from experience and are used in data processing for categorization and prediction 53. ANFIS is also suitable for various applications, such as optimizing significant extraction variables, thanks to its ability to analyse both numerical and linguistic data 54. Additionally, merging ANN with fuzzy-set theory helps address the benefits and limitations of both approaches 55. Jang developed ANFIS, an intelligent computing tool that can be used to solve complex and nonlinear problems 56. Both linear and nonlinear relationships between input and output responses can be analysed with this method 57. This method employs a rule-based fuzzy logic model, which is trained with the help of rules generated during the procedure 58. Data is used to inform the training process 59. Furthermore, the system is trained using least-squares and backpropagation methods. The backpropagation algorithm of the ANN is used as the first step in training the adaptive network-based fuzzy inference system (ANFIS) 60. The input parameters of particle size (X1), methanol concentration (X2), ultrasound exposure time (X3), temperature (X4), and ultrasound intensity (X5) are used as input variables for the ANN. The output response of the ANN is then applied to the fuzzy logic membership functions. Using the fuzzy inference system (FIS) improves the precision of the optimization of these variables. Each predicted output response, namely the yield of TPC and TFC and the antioxidant scavenging potential (DPPH*sc, ABTS*sc, and FRAP), was analysed using ANFIS modeling with the data from the same CCD of the RSM experiments. In this study, the Sugeno-type fuzzy inference model was employed for the ANFIS modeling, with multiple inputs (X1, X2, X3, X4, and X5) and a single output response (y1/y2/y3/y4/y5), because the Sugeno-type fuzzy inference system is more computationally efficient than the Mamdani type. The Mamdani type depends more on specialized knowledge, whereas the Sugeno type is trained on actual data. The ANFIS architecture (Fig. 1) takes multiple inputs and produces a single output response at a time. As an illustration, consider a Sugeno fuzzy inference system (FIS) with one output response, "z", and two inputs, "x" and "y". Two fuzzy if-then rules for a first-order Sugeno fuzzy model can be expressed as follows: Rule 1: If x is A1 and y is B1, then f1 = p1 x + q1 y + r1; Rule 2: If x is A2 and y is B2, then f2 = p2 x + q2 y + r2, where Ai and Bi are the fuzzy sets, fi is the output response, and pi, qi, and ri are the design parameters determined during the training process. The number of membership functions for each input variable was determined by trial and error. To predict the outcome of the extraction of the majority of bioactive components from grape seed extract, the experimental data were divided into training, testing, and validation sets for the network model using the MATLAB v. R2013a Fuzzy logic toolbox.
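The two-rule first-order Sugeno model above can be evaluated directly. The sketch below assumes Gaussian membership functions, a product operator for "and", and weighted-average defuzzification, which are common ANFIS choices but are not stated in the text; all membership and consequent parameters are hypothetical placeholders rather than trained values.

```python
import math


def gaussian(x, centre, sigma):
    """Gaussian membership grade of x in a fuzzy set (assumed shape)."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)


# Hypothetical premise parameters (centre, sigma) for A1, A2 and B1, B2
A_SETS = [(-1.0, 1.0), (1.0, 1.0)]
B_SETS = [(-1.0, 1.0), (1.0, 1.0)]
# Hypothetical consequent parameters (p_i, q_i, r_i) of f_i = p_i*x + q_i*y + r_i
CONSEQUENTS = [(1.0, 0.5, 2.0), (-0.5, 1.0, 4.0)]


def sugeno_output(x, y):
    """Weighted average of the rule outputs f_i by firing strength w_i."""
    weights, outputs = [], []
    for (a_c, a_s), (b_c, b_s), (p, q, r) in zip(A_SETS, B_SETS, CONSEQUENTS):
        w = gaussian(x, a_c, a_s) * gaussian(y, b_c, b_s)  # "and" as product
        weights.append(w)
        outputs.append(p * x + q * y + r)
    return sum(w * f for w, f in zip(weights, outputs)) / sum(weights)


print(sugeno_output(0.0, 0.0))
```

During ANFIS training, the premise parameters (here the centres and widths) and the consequent parameters p_i, q_i, r_i are the quantities that are adjusted; this sketch only performs the forward pass.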

Optimization using machine learning algorithm
Machine learning optimization is the technique of continuously increasing the accuracy of a machine learning model and decreasing its error rate 61. Most machine learning models use training data to learn the link between input and output responses. After this, the models can be applied to categorize fresh incoming data or predict trends. Since the target values of the experiment are continuous, a Random Forest Regressor is employed 62. This ensemble learning approach combines different decision trees to create a more accurate model 63. Each decision tree is trained on a subset of the training data and a randomly selected subset of the features. This randomization helps to improve the prediction accuracy of the model and to reduce overfitting. The Random Forest algorithm generates many decision trees and then aggregates their forecasts to obtain a more accurate and reliable prediction 64. The primary benefit of the random forest regressor is its ability to handle high-dimensional data with a nonlinear relationship between the input and target values. The data set for this investigation has the experimental values X1, X2, X3, X4, and X5 as inputs and y1, y2, y3, y4, and y5 as output responses. To generate the predictions, the dataset of experimental values is first imported and then preprocessed to check for missing or noisy data.
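The mechanism described above (bootstrap samples, random feature subsets, averaged tree predictions) can be illustrated with a toy ensemble written with the Python standard library only. This is a teaching sketch under stated assumptions, not the implementation used in the study and not the API of any ML library; the tree depth, number of trees, and the synthetic data are all hypothetical.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, n_features, rng, depth=0, max_depth=3, min_leaf=2):
    """Grow a regression tree; each split looks at a random feature subset."""
    if depth >= max_depth or len(y) < 2 * min_leaf or len(set(y)) == 1:
        return mean(y)  # leaf: predict the mean response
    best = None
    for f in rng.sample(range(len(X[0])), n_features):
        for threshold in sorted({row[f] for row in X})[1:]:
            left = [i for i, row in enumerate(X) if row[f] < threshold]
            right = [i for i, row in enumerate(X) if row[f] >= threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:
        return mean(y)
    _, f, threshold, left, right = best

    def grow(idx):
        return build_tree([X[i] for i in idx], [y[i] for i in idx],
                          n_features, rng, depth + 1, max_depth, min_leaf)

    return (f, threshold, grow(left), grow(right))


def predict_tree(node, row):
    while isinstance(node, tuple):  # internal nodes are tuples, leaves floats
        f, threshold, left, right = node
        node = left if row[f] < threshold else right
    return node


def fit_forest(X, y, n_trees=25, n_features=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 n_features, rng))
    return forest


def predict_forest(forest, row):
    return mean(predict_tree(tree, row) for tree in forest)


# Synthetic coded inputs and response (illustrative only, not the study data)
data_rng = random.Random(42)
X = [[data_rng.uniform(-2, 2) for _ in range(5)] for _ in range(60)]
y = [100 - 5 * sum(v * v for v in row) + data_rng.gauss(0, 1) for row in X]
forest = fit_forest(X, y)
print(predict_forest(forest, [0.0] * 5))
```

Because every leaf stores a mean of training responses and the forest averages the trees, a prediction always lies within the range of the training targets, which is one reason random forests do not extrapolate beyond the design space.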

Verification of the optimized condition
The applicability of the experiment was confirmed by evaluating the dependability of the optimization findings. Under the optimal conditions, a triplicate experiment was carried out based on the combination of responses with minimal deviation. The mean experimental results were compared with the predicted values to assess the validity of the model.

Utilizing GC-MS for volatile bioactive compounds identification
Gas chromatography-mass spectrometry (GC-MS) was used to identify the volatile bioactive components in the optimized grape seed extract 7. The GC-MS analysis was performed on a gas chromatograph coupled to a mass spectrometer (Shimadzu QP-2010 GC-MS system) equipped with a non-polar 60 m RTX 5MS column, with helium as the carrier gas at a constant pressure of 15 psi and an adjusted column flow of 1.00 mL min−1. The initial oven temperature was 40 °C, held for 3 min, and the final oven temperature was 480 °C, at a rate of 10 °C min−1. A 2 μL sample was injected in split-less mode. Mass spectra were recorded over the 35-650 amu range with an electron impact ionization energy of 70 eV. The total running time for a sample was 45 min. Bioactive ingredients were identified from the retention times of the chromatographic peaks using a quadrupole detector, and their relative retention indices were compared with the NIST 2014 (National Institute of Standards and Technology, 2014) library. The NIST library database contains more than 62,000 patterns of well-known compounds. The spectra of the unknown bioactive ingredients of the grape seed extract fraction were compared with the reference mass spectra of recognised components deposited in the NIST library collection.

Utilizing LC-MS for non-volatile bioactive compounds identification
Liquid chromatography-mass spectrometry (LC-MS) was used to identify the non-volatile bioactive components in the optimized grape seed extract. The LC-MS analysis was performed using the 1290 Infinity UHPLC System and 6550 iFunnel Q-TOF (Agilent Technologies, USA). Separations were carried out on a Zorbax SB-C18 column (2.1 × 50 mm, 1.8 µm particle size, Agilent Technologies, USA). Two mobile phases were used: A, 0.1% formic acid in water, and B, 90% acetonitrile in water, at a flow rate of 500 µL min−1. The LC gradient was held at 5% B from 0 to 3 min, followed by a linear increase from 5 to 20% B between 3 and 25 min, from 20 to 40% B between 25 and 40 min, from 40 to 50% B between 40 and 55 min, and finally from 50 to 95% B between 55 and 63 min. Peak detection was performed through direct injection mode with an electrospray ionization (ESI) probe in both positive and negative modes. The non-volatile bioactive compounds were identified by obtaining the molecular mass and structural formula of the compounds with the help of online libraries.
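The gradient programme above is piecewise linear, so the mobile-phase composition at any time can be obtained by interpolating between the stated breakpoints. The sketch below encodes those breakpoints as a table; the function and variable names are illustrative.

```python
# (time in min, % mobile phase B) breakpoints taken from the gradient above
GRADIENT = [(0, 5), (3, 5), (25, 20), (40, 40), (55, 50), (63, 95)]


def percent_b(t_min):
    """Linearly interpolate the % of mobile phase B at time t_min."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the 0-63 min gradient programme")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 14, 40, 63):
    print(t, percent_b(t))
```

For example, at 14 min the run is halfway through the 3-25 min segment, so the composition is halfway between 5 and 20% B.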

Statistical analysis
All the experiments were performed based on the CCD of RSM and repeated three times. Design Expert software (trial version 8.0.7.1, Stat-Ease, Inc., 2021 East Hennepin Ave, Suite 480, Minneapolis, MN 55413, USA) was used for the experimental design, optimization, data analysis, prediction, and quadratic model building. In the regression model, the goodness of fit was evaluated based on R2 (coefficient of determination). Further, the statistical analysis was assessed by one-way analysis of variance (ANOVA), with p-values less than 0.05 considered significant. The optimal extraction conditions were analysed using contour plots and three-dimensional (3D) response surface plots. Microsoft Excel (Microsoft Office Professional Plus 2021) was used for statistical analysis, and the fuzzy logic toolbox in MATLAB (Matrix Laboratory) v R2013b software was used for the adaptive neuro-fuzzy modeling.

Adequacy of the models
Ultrasound-aided extraction is one of the best techniques for extracting thermosensitive and minute bioactive ingredients from natural sources. It offers several advantages: it consumes less solvent and energy, heat is not generated while the ultrasonic waves rupture the cell wall, and the ultrasonic waves break the cell wall very quickly so that the solvent can solubilize the internal active ingredients 65. This study successfully optimized the independent extraction variables through the CCD of RSM. The quadratic model combined the linear, interaction, and quadratic effects of the ultrasound-aided extraction parameters on the maximal extraction yield of bioactive compounds from grape seed extract 66. CCD is flexible and effective, and provides much information about experimental variables and errors with the fewest experimental runs 67. Therefore, the experiments were carried out according to the central composite design (CCD). Table 2 presents the experimental and predicted values of TPC, TFC, and antioxidant scavenging potentials (%DPPH*sc, %ABTS*sc, and FRAP) of grape seed extract under the combined extraction parameters. Based on the experimental results, the optimized condition was observed at 0.155 mm particle size (X1), 65% methanol concentration (X2), 23 min ultrasound exposure time (X3), 40 °C temperature (X4), and 70 W cm−2 ultrasound intensity (X5). Under this condition, the optimal yields of TPC, TFC, and antioxidant scavenging potential were 670.32 mg GAE/g, 451.45 mg RE/g, 81.23% DPPH*sc, 77.39% ABTS*sc, and 71.55 μg mol (Fe(II))/g FRAP (raw data are presented in the Related file). The interaction of extraction parameters may be the primary factor causing the maximum production of bioactive components from grape seeds. Similar results were obtained in our previous optimization experiments on the ultrasound-assisted extraction of bioactive compounds from Garcinia indica and Hemidesmus indicus Linn. 7,14. The experimental results were fitted to a second-order polynomial equation, and ANOVA was used to analyse the resulting regression equation. The significance of the coefficients was ascertained at a 95% confidence level using the F and p tests. The smaller the p-value and the larger the F-value, the more significant the associated variable 68.
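As an illustration of how a five-factor CCD of this kind can be laid out, the following Python sketch builds a coded design and converts it to actual factor values. It assumes a full 2^5 factorial, a rotatable axial distance α = 2^(5/4) ≈ 2.378, and eight centre points, which together give 50 runs. Only the factor ranges, the total of 50 data sets, and the 0.155 mm particle size are taken from this study; the design layout and the function names are hypothetical.

```python
from itertools import product

# Centre value and step (one coded unit) per factor, from the stated -1..+1 ranges.
FACTORS = {
    "X1 particle size (mm)": (0.75, 0.25),
    "X2 methanol (%)": (65.0, 5.0),
    "X3 exposure time (min)": (23.0, 5.0),
    "X4 temperature (°C)": (40.0, 5.0),
    "X5 intensity (W cm^-2)": (70.0, 5.0),
}


def central_composite(k, n_center):
    """Return coded runs: 2^k factorial + 2k axial + n_center centre points."""
    alpha = (2 ** k) ** 0.25  # assumed rotatable axial distance
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    axial = []
    for i in range(k):
        for level in (-alpha, alpha):
            point = [0.0] * k
            point[i] = level
            axial.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return factorial + axial + center


def decode(coded_run):
    """Convert one coded run to actual factor values."""
    return [c + x * step for x, (c, step) in zip(coded_run, FACTORS.values())]


design = central_composite(k=5, n_center=8)  # 32 + 10 + 8 runs (assumed split)
runs = [decode(run) for run in design]

print(len(runs), "runs")
print("lowest particle size (mm):", round(min(run[0] for run in runs), 4))
```

Under these assumptions the lowest axial particle size is 0.75 − 2.378 × 0.25 ≈ 0.155 mm, which agrees with the particle size reported for the optimized condition.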
The p-values were used as an essential tool to check the significance of the interactions of the variables 69. Model terms with a p-value < 0.05 are considered statistically significant, while terms with a p-value greater than 0.05 are considered non-significant. The model F-value obtained in this investigation was 18.49 (p < 0.0001), and the present model was therefore highly significant. The multiple regression coefficient of determination (R²) was used, together with the lack-of-fit test, to assess how well the model describes the output responses. The quadratic models gave R² values of 0.9273, 0.9323, 0.9045, 0.8730, and 0.8800 for TPC, TFC, and the antioxidant scavenging potentials (%DPPH*sc, %ABTS*sc, and FRAP), respectively, which demonstrated good depiction of the variables by the model and satisfied Le Man et al. For this reason, a model is considered acceptable when R² > 0.87. For TPC, the predicted R² value (0.6930) is in reasonable agreement with the adjusted R² value (0.8772), since the difference is less than 0.2, and at the 95% confidence level the quadratic models fit the experimental data well. This implied that 95% of the experimental values agree with the model's predictions.
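The adjusted R² values reported in this section can be cross-checked from the R² values with the standard formula adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). The sketch below assumes n = 50 experimental runs and p = 20 model terms (5 linear, 10 interaction, and 5 quadratic terms of a full quadratic model in five factors); both counts are assumptions inferred from the text, and the function name is hypothetical.

```python
def adjusted_r2(r2, n_runs, n_terms):
    """Adjusted R^2 for a model with n_terms regressors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n_runs - 1) / (n_runs - n_terms - 1)


N_RUNS = 50   # assumed: the 50 experimental data sets of the CCD
N_TERMS = 20  # assumed: 5 linear + 10 interaction + 5 quadratic terms

# (R^2, adjusted R^2) as reported in the text for each response
REPORTED = {
    "TPC": (0.9273, 0.8772),
    "TFC": (0.9323, 0.8856),
    "%DPPH*sc": (0.9045, 0.8386),
    "%ABTS*sc": (0.8730, 0.7855),
    "FRAP": (0.8800, 0.7973),
}

computed = {name: adjusted_r2(r2, N_RUNS, N_TERMS) for name, (r2, _) in REPORTED.items()}

for name, (_, reported_adj) in REPORTED.items():
    print(f"{name}: computed {computed[name]:.4f}, reported {reported_adj:.4f}")
```

With these assumed counts the computed values agree with the reported adjusted R² values to within rounding of the published R².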

Total phenolic content (TPC)
The experimental and predicted values of TPC for the various combinations of extraction parameters in the ultrasound-aided solvent extraction method are presented in Table 3. Using the experimental data, an analysis of variance (ANOVA) was performed to determine the significance of the model and its coefficient of determination (R²). The statistical significance of the model equation was evaluated using the lack-of-fit test, the coefficient of determination (R²), and p-values 70,71. From the data in Table 3 and polynomial Eq. (6), the linear terms particle size (X1) and temperature (X4) and the quadratic terms X2², X3², and X5² contributed significantly (p < 0.05) to the maximum yield of total phenolic content. As the ANOVA results in Table 3 illustrate, the model reflects the relationship between the experimental values and their predicted responses with a high F-value (18.49) and a very low probability value (p < 0.0001). Additionally, adequate precision and the coefficient of determination (R²) were significant markers of the model fitting. An R² value near one indicates that the suggested model explains the variability of the experimental data well; in other words, there is a strong correlation between the observed and predicted values 72. For the ultrasound-aided solvent extraction of total polyphenolics, R² = 0.9273, predicted R² = 0.6930, and adjusted R² = 0.8772, which indicated that this model has good reliability and fitting. The second-order polynomial equation for the fitted quadratic model for TPC in coded variables is given in Eq. (6).
The residuals were subsequently examined using the model data. A residual is the difference between an experimental value of a response and the value predicted by the model. Figure 2a displays the studentized residuals of X1, X2, X3, X4, and X5 as a normal percent probability plot; they do not deviate from the normal distribution. A model that fits the data well is indicated by a high coefficient of determination (R² > 0.9), as seen in Fig. 2b. The 3D response surface and 2D contour plot in Fig. 2c,d reveal the significant effect of particle size (X1) and temperature (X4) on the yield of TPC, with methanol concentration, ultrasonic exposure time, and ultrasonic intensity held at their zero levels (65%, 23 min, and 70 W cm−2, respectively). The total polyphenolic content ranged from 360.76 to 670.32 mg GAE/g, as Table 2 demonstrates. The maximum yield of total polyphenolic content was achieved at 0.155 mm particle size, 65% methanol concentration, 23 min ultrasonic exposure time, 40 °C, and 70 W cm−2 ultrasonic intensity.
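The model F-value quoted for TPC can be cross-checked from R² alone, because for a least-squares regression F = (R²/p) / ((1 − R²)/(n − p − 1)). The sketch below again assumes n = 50 runs and p = 20 model terms, which are inferred from the text rather than stated by it; the function name is hypothetical.

```python
def model_f_value(r2, n_runs, n_terms):
    """Overall regression F statistic derived from R^2."""
    df_model = n_terms
    df_residual = n_runs - n_terms - 1
    return (r2 / df_model) / ((1.0 - r2) / df_residual)


# Assumed: 50 runs, 20 terms of the full quadratic model; R^2 of TPC from the text.
f_tpc = model_f_value(0.9273, n_runs=50, n_terms=20)
print(f"Model F-value for TPC: {f_tpc:.2f}")
```

Under these assumptions the result is consistent with the reported F-value of 18.49.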

Total flavonoid content (TFC)
The ANOVA in Table 3 and the second-order polynomial Eq. (7) show that the linear terms particle size (X1) and temperature (X4) and the quadratic terms X2², X3², and X5² have significant (p < 0.05) effects on the maximum extraction yield of total flavonoid content from grape seed extract. The effects of the other terms were non-significant because their p-values were greater than 0.05. The model fits the data well, as evidenced by the response surface analysis of the total flavonoid content of the extract, which gave a high coefficient of determination (R² = 0.9323, adjusted R² = 0.8856, predicted R² = 0.7567). Figure 2e illustrates the normal % probability plot of the studentized residuals of X1, X2, X3, X4, and X5; they are normally distributed without deviations. The coefficient of determination (R²) should be close to 0.9 for a good fit of the model. The closer the goodness of fit is to 1, the better the empirical model fits the actual data 73. Figure 2f shows a high coefficient of determination (R² > 0.9), which is indicative of a strong fit. Furthermore, the maximum content of total flavonoids in the grape seed extract is influenced by particle size (X1) and temperature (X4), as shown by the 3D response surface and contour plot in Fig. 2g,h, in which the other three variables, methanol concentration (X2), ultrasonic time (X3), and ultrasonic intensity (X5), were kept constant at their zero levels (65%, 23 min, and 70 W cm−2, respectively). Table 2 shows a range of 243.17 to 451.45 mg RE/g for total flavonoids. The highest yield of flavonoids was produced with 0.155 mm particle size, 65% methanol concentration, 23 min ultrasonic exposure time, 40 °C, and 70 W cm−2 ultrasonic intensity; the lowest content was produced with 1.35 mm particle size, 65% methanol concentration, 23 min ultrasonic exposure time, 40 °C, and 70 W cm−2 ultrasonic intensity.

Antioxidant scavenging potentials (%DPPH*sc, %ABTS*sc and FRAP)
Based on the statistical analysis of the experimental data in Table 3 and the second-order polynomial Eqs. (8)-(10), the linear term X1, the interaction terms X1X4 and X3X5, and the quadratic terms X1² and X2² contribute significantly (p < 0.05) to the maximum yield of all three antioxidant scavenging potentials (%DPPH*sc, %ABTS*sc, and FRAP) of grape seed extract. In addition, Table 3 shows that the interaction terms X2X4 and X4X5 contribute to the highest DPPH* scavenging activity of grape seed extract. Similarly, the linear term X4 and the interaction term X3X5 have a significant effect on %ABTS*sc and FRAP. The R² values of %DPPH*sc, %ABTS*sc, and FRAP are 0.9045, 0.8730, and 0.8800; the adjusted R² values are 0.8386, 0.7855, and 0.7973; and the predicted R² values are 0.6661, 0.5611, and 0.5789, respectively. For all three antioxidant potentials, the adjusted R² values are close to the predicted R² values, with the least lack-of-fit p-value (< 0.0001) for each of %DPPH*sc, %ABTS*sc, and FRAP. These data suggest that the model is significantly accurate. The second-order polynomial equations for the fitted quadratic models for %DPPH*sc, %ABTS*sc, and FRAP in coded variables are given in Eqs. (8)-(10). Figure 3a,e,i shows the normal percentage probability plots of the studentized residuals of X1, X2, X3, and X4; they are normally distributed with no deviation for all three antioxidant scavenging experiments. Figure 3b,f,j displays high coefficients of determination (R² > 0.87), which are indicative of a strong fit. The 3D response surfaces and 2D contour plots for the antioxidant scavenging potentials (%DPPH*sc, %ABTS*sc, and FRAP) as functions of particle size (X1) and temperature (X4) are shown in Fig. 3c,d,g,h,k,l. The figures show that 0.155 mm particle size, 65% methanol concentration, 23 min ultrasonic exposure time, 40 °C, and 70 W cm−2 ultrasonic intensity correspond to the highest antioxidant (%DPPH*sc, %ABTS*sc, and FRAP) potential. The highest antioxidant scavenging potentials are 81.23% DPPH*sc, 77.39% ABTS*sc, and 71.55 μg mol (Fe(II))/g FRAP.

Figure 3. Normal percentage probability plot for the studentized residuals for highest yield of %DPPH*sc (a), %ABTS*sc (e) and FRAP (i). Relationship between experimental and predicted values for highest yield of %DPPH*sc (b), %ABTS*sc (f) and FRAP (j). Response surface and contour plots showing the combined effects of particle size (X1) and temperature (X4) on the highest yield of %DPPH*sc, %ABTS*sc and FRAP (c,d,g,h,k,l) when the other variables were held at fixed level.

ANFIS modelling
ANFIS modelling was used to further verify the experimental data and predict the extraction variables of bioactive ingredients in the grape seed extract. The same 50 experimental data sets shown in Table 2 were divided into three sets to develop the ANFIS model: 65% for training, 30% for testing, and 5% for validating the models. These sets were then used to construct a fuzzy inference system (FIS), whose membership function parameters were adjusted using the least-squares method in conjunction with the back-propagation algorithm. The Fuzzy Logic Toolbox in MATLAB R2013a was used to train the ANFIS. To ensure accuracy, the FIS of the ANFIS model must be constructed with membership functions for five input variables and five output responses. The proposed architecture of the ANFIS model comprises five input parameters and one output value, as displayed in Fig. 1. Several parameters must be verified one at a time. For every input variable, including particle size (X1), methanol concentration (X2), ultrasound exposure time (X3), temperature (X4), and ultrasound intensity (X5), there are three fuzzy sets: low, medium, and high. Similarly, the predicted output responses, TPC (670 mg gallic acid equivalents (GAE)/g), TFC (451 mg rutin equivalents (RE)/g), DPPH*sc (81.2%), ABTS*sc (77.4%), and FRAP (71.6 μg mol (Fe(II))/g), were defined in five fuzzy sets: very low, low, medium, high, and very high. Experimental data and human observation data were utilized to construct the fuzzy rules. The fuzzy inference system had 324 fuzzy rules and 664 network nodes (five input variables, one output response at a time, and Gaussian membership functions). The predicted values of the responses obtained through RSM were utilized to improve the fuzzy rules.
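To make the fuzzification step concrete, the following Python sketch evaluates Gaussian membership functions for the three fuzzy sets (low, medium, high) of one input variable. The centres (lower bound, midpoint, and upper bound of the 35-45 °C temperature range) and the width (one quarter of the range) are assumptions chosen for illustration; they are not the parameters fitted by the MATLAB model, and the function names are hypothetical.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x for a fuzzy set centred at `center`."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def fuzzify(x, low, high):
    """Membership degrees of x in three fuzzy sets spanning [low, high]."""
    sigma = (high - low) / 4.0  # assumed width, not a fitted parameter
    centers = {"low": low, "medium": (low + high) / 2.0, "high": high}
    return {label: gaussian_mf(x, c, sigma) for label, c in centers.items()}


# Temperature (X4) at the optimized value of 40 °C within the 35-45 °C range.
temperature = fuzzify(40.0, low=35.0, high=45.0)
for label, degree in temperature.items():
    print(f"{label}: {degree:.3f}")
```

In ANFIS training, the centre and width of each such function are the premise parameters that the hybrid least-squares and back-propagation procedure adjusts.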

Machine learning algorithm
The inputs are the experimental parameters (X1, X2, X3, X4, and X5), and the output responses are y1, y2, y3, y4, and y5. The dataset therefore contains five target columns. Five random forest regressor models were constructed by keeping the input data constant and changing the output response for each model. The number of estimators was set to 100. After the models had been fitted to the training data, they were evaluated using the R error value. Subsequently, the models predicted the responses for the input data (X1: 0.1554 mm particle size, X2: 65% methanol concentration, X3: 23 min, X4: 40 °C, and X5: 70 W cm−2 ultrasound intensity). The predicted output responses were 643.53 mg GAE/g total polyphenolics, 411.64 mg RE/g total flavonoids, 76.84% DPPH*sc, 71.12% ABTS*sc, and 66.30 μg mol (Fe(II))/g FRAP. Figure 4a-e illustrates, for each of the five models, the error variance between the predicted and actual values.
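The one-model-per-response setup described above can be sketched as follows. The text does not name the software used, so scikit-learn's RandomForestRegressor is an assumption here, and the data are random placeholders rather than the values of Table 2; the response names, the scale factors, and the helper function are hypothetical.

```python
import random

from sklearn.ensemble import RandomForestRegressor

RESPONSES = ["TPC", "TFC", "DPPH_sc", "ABTS_sc", "FRAP"]


def fit_response_models(X, Y, n_estimators=100, seed=0):
    """Fit one random forest per response column; the inputs X are shared."""
    models = {}
    for j, name in enumerate(RESPONSES):
        y = [row[j] for row in Y]
        model = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
        models[name] = model.fit(X, y)
    return models


# Placeholder data (NOT the study's measurements): 50 runs, 5 factors, 5 responses.
rng = random.Random(0)
X = [[rng.uniform(0.155, 1.35), rng.uniform(53, 77), rng.uniform(11, 35),
      rng.uniform(28, 52), rng.uniform(58, 82)] for _ in range(50)]
SCALES = (500.0, 330.0, 60.0, 57.0, 53.0)  # arbitrary response magnitudes
Y = [[s * (1.5 - x[0]) + rng.gauss(0.0, 0.01 * s) for s in SCALES] for x in X]

models = fit_response_models(X, Y)

# Predict all five responses at the optimized extraction condition.
optimum = [[0.1554, 65.0, 23.0, 40.0, 70.0]]
predictions = {name: float(m.predict(optimum)[0]) for name, m in models.items()}
print(predictions)
```

Keeping a separate forest per response mirrors the description in the text and lets each model be evaluated on its own error metric.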

GC-MS analysis
A total of 20 peaks were observed in the GC-MS chromatogram of the optimally obtained grape seed extract (Fig. 6). The compounds were identified by comparing the peak retention time, peak area (%), height (%), and mass spectral fragmentation patterns with those of the known compounds listed in the National Institute of Standards and Technology (NIST) library. Among the 20 peaks, 12 bioactive compounds (based on the active nucleus of the structure) were identified. Table 5 shows the identified bioactive compounds with their molecular formula and molecular mass.

https://doi.org/10.1038/s41598-023-49839-y www.nature.com/scientificreports/

Figure 1. The architecture of the ANFIS input and output response model.

Figure 2. Normal percentage probability plot for the studentized residuals for highest yield of TPC (a) and TFC (e). Relationship between experimental and predicted values for highest yield of TPC (b) and TFC (f). Response surface and contour plots showing the combined effects of particle size (X1) and temperature (X4) on the highest yield of TPC (c,d) and TFC (g,h) when methanol concentration, ultrasonic exposure time, and ultrasonic intensity were held at fixed level.

%DPPH*sc: $y_3 = 72.28 - 7.81X_1 \dots$ (8)

Verification of the model
The optimized extraction condition obtained from the CCD of RSM was confirmed with verification experiments for the maximum yield of bioactive ingredients from grape seed extract. The values of the significantly influencing parameters were changed slightly, and the verification experiments were performed individually. The verification results were fed into the Design-Expert software and analysed against the predicted output responses for the yield of TPC, TFC, and antioxidant scavenging potentials (%DPPH*sc, %ABTS*sc, and FRAP) from grape seeds. ANFIS and the machine learning algorithm used the same data for further verification. The verification experiments showed that particle size, methanol concentration, and temperature significantly affected the highest yield of bioactive ingredients from grape seeds. Table 4 displays the results of the verification experiments conducted under the optimized conditions and with minor modifications of the extraction parameter values. In the verification experiment with 0.155 mm particle size of grape seed powder, 62.5% methanol, 23 min ultrasonic exposure time, 40 °C, and 70 W cm−2 ultrasonic intensity, the experimental values of TPC, TFC, and antioxidant scavenging potentials were 672.45 mg GAE/g, 454.65 mg RE/g, 81.89%, 77.85%, and 71.52 μg mol (Fe(II))/g, respectively. The corresponding values predicted by the RSM models for TPC, TFC, %DPPH*sc, %ABTS*sc, and FRAP were 772.64 mg GAE/g, 469.42 mg RE/g, 82.22%, 76.72%, and 71.52 μg mol (Fe(II))/g, respectively. By changing the extraction parameter (input) values, the values of the responses (output) were observed using a rule viewer plot (Fig. 5). The rule viewer is a compressed toolbox with built-in neural weight optimization and fuzzification techniques. Implementation experiments and comparison of the outcomes with the model's predicted values allowed additional cross-validation of the model. The responses predicted by the ANFIS model for TPC, TFC, and antioxidant scavenging potentials (%DPPH*sc, %ABTS*sc, and FRAP) of the grape seed extract were 632 mg GAE/g, 426 mg RE/g, 76.5%, 72.8%, and 67.3 μg mol (Fe(II))/g, respectively. The machine learning algorithm model predicted values for TPC, TFC, and antioxidant scavenging potentials (%DPPH*sc, %ABTS*sc, and FRAP) of 669.69 mg GAE/g, 455.11 mg RE/g, 81.18%, 76.93%, and 71.14 μg mol (Fe(II))/g, respectively. According to these findings, the RSM, ANFIS, and machine learning predictions agree well with the experimentally obtained values and the regression analyses.
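The agreement between the three models and the verification experiment can be quantified as a percent deviation from the experimental value. The sketch below uses the values quoted in this section exactly as printed; the dictionary keys and the function name are hypothetical.

```python
# Experimental values of the verification run, as quoted in the text.
EXPERIMENTAL = {"TPC": 672.45, "TFC": 454.65, "DPPH_sc": 81.89,
                "ABTS_sc": 77.85, "FRAP": 71.52}

# Model predictions, as quoted in the text.
PREDICTED = {
    "RSM": {"TPC": 772.64, "TFC": 469.42, "DPPH_sc": 82.22,
            "ABTS_sc": 76.72, "FRAP": 71.52},
    "ANFIS": {"TPC": 632.0, "TFC": 426.0, "DPPH_sc": 76.5,
              "ABTS_sc": 72.8, "FRAP": 67.3},
    "Random forest": {"TPC": 669.69, "TFC": 455.11, "DPPH_sc": 81.18,
                      "ABTS_sc": 76.93, "FRAP": 71.14},
}


def percent_deviation(predicted, experimental):
    """Absolute deviation of a prediction, as a percentage of the experiment."""
    return 100.0 * abs(predicted - experimental) / experimental


deviation = {
    model: {name: percent_deviation(value, EXPERIMENTAL[name])
            for name, value in responses.items()}
    for model, responses in PREDICTED.items()
}

for model, responses in deviation.items():
    row = ", ".join(f"{name} {dev:.2f}%" for name, dev in responses.items())
    print(f"{model}: {row}")
```

A table of this kind makes it easy to see, response by response, which model lies closest to the measured value.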

Figure 4. Machine learning algorithm validated the experimental and predicted values.

Figure 5. ANFIS rule viewer for the effect of extraction parameters on responses for extraction of TPC, TFC and antioxidants from grape seeds extract.

Figure 6. GC-MS spectra of the optimized extract of grape seeds. List of bioactive phytocompounds present in the optimally obtained extract.

Table 1. Preliminary selection of appropriate extraction solvent. *All the experiments were repeated three times and values are expressed as mean ± standard deviation.

Table 2. Central composite design (CCD) with experimental responses and predicted responses. *All the experiments were repeated three times.

Table 5. Analysed bioactive ingredients from the optimized extract of grape seeds through GC-MS chromatogram.